Abstract

This  project  developed  theory  and  systems  support  to  aid  in  the  construction  of  adaptive,  survivable 
distributed  systems.  The  systems  are  designed  to  run  in  highly  dynamic  environments  such  as  the 
internet,  wireless  networks,  and  sensor  networks.  Participating  processes  may  join,  leave,  and 
fail  during  computation.  The  systems  that  were  considered  solve  problems  of  data  sharing  and 
management,  resource  sharing  and  management,  communication,  and  coordination. 

Specifically,  the  project  involved  developing  reusable  “building  blocks” — global  service  specifications 
and  distributed  algorithms — for  dynamic  distributed  systems.  The  work  included  an  extensive 
study  of  view-oriented  group  communication  services  and  algorithms,  which  is  now  “transitioning” 
into  use  at  Lincoln  Laboratories.  A  major  focus  was  on  design  and  analysis  of  algorithms  for 
implementing  reliable  atomic  shared  memory  in  highly  dynamic  networks.  Other  algorithmic  work 
covered  dynamic  algorithms  for  atomic  broadcast,  scalable  reliable  multicast,  and  topology  control. 

In addition, the project produced results on mathematical semantic foundations to support modeling
and analysis of highly dynamic distributed systems, and on tools to support this effort.

2  Objectives


This  project  was  intended  to  develop  theory  and  systems  support  to  aid  in  the  construction  of 
adaptive,  survivable  distributed  systems.  The  systems  are  designed  to  run  in  highly  dynamic 
environments  such  as  the  internet,  wireless  networks,  and  sensor  networks.  Participating  processes 
may  join,  leave,  and  fail  during  computation.  The  systems  to  be  considered  solve  problems  of  data 
sharing  and  management,  resource  sharing  and  management,  communication,  and  coordination. 

Specifically,  the  project  involved  developing  reusable  “building  blocks” — global  service  specifications 
and  distributed  algorithms — for  dynamic  distributed  systems.  The  work  included  an  extensive 
study  of  view-oriented  group  communication  services  and  algorithms.  A  major  focus  was  on  design 
and  analysis  of  algorithms  for  implementing  reliable  atomic  shared  memory  in  highly  dynamic 
networks.  Other  problems  to  be  considered  included  atomic  broadcast,  scalable  reliable  multicast, 
topology  control,  and  clock  synchronization. 

In  addition,  the  project  was  intended  to  produce  results  on  mathematical  semantic  foundations 
to  support  modeling  and  analysis  of  highly  dynamic  distributed  systems,  and  to  produce  tools  to 
support this effort.


3  Status  of  Effort 

The  project  is  now  completed.  Many  of  the  results  obtained  during  this  project  have  suggested 
new research directions. These are detailed in our newly approved AFOSR proposal [27].


4  Accomplishments/New Findings

Our  results  fall  into  two  categories:  algorithms,  and  formal  modeling  and  analysis  techniques. 

4.1  Algorithms 

Our  work  on  algorithms  includes  results  on  group  communication,  reliable  broadcast,  data  sharing, 
namespace  management,  and  topology  control. 

4.1.1  Group  Communication 

Group  communication  services,  such  as  those  provided  by  Isis  [5],  Transis  [12],  and  many  other 
systems  (e.g.,  [31,  14,  4,  42,  19,  43]),  are  high-level  communication  services  that  allow  processes  to 
communicate  with  named  groups  of  processes  with  changing  membership  sets.  These  services  are 
based  on  group  membership  services,  which  maintain  views  of  the  current  group  membership  and 
inform  clients  about  changes  in  the  views.  Group  communication  (GC)  services  guarantee  that 
communication  respects  views,  and  integrate  multicast  communication  with  the  views.  Within 
each  view,  they  typically  provide  strong  message  ordering  and  reliability  guarantees  such  as  atomic 
broadcast, causal broadcast, or virtual synchrony.

Our research provided a theoretical foundation for this area. Namely, we developed precise definitions
of properties to be satisfied by group communication systems, and developed methods of
modeling  and  analyzing  GC  systems.  We  developed  and  analyzed  new  algorithms  for  implementing 
GC  services  and  for  building  applications  on  top  of  GC  services.  We  found  errors  in  some  existing 
implementations. 


Partitionable  group  communication  services:  Fekete,  Lynch,  and  Shvartsman  [15,  5,  16] 
developed  an  I/O  automaton  specification  for  a  prototype  GC  service  patterned  on  services  used 
in  Isis  [5]  and  Transis  [12].  We  called  this  service  VS  (for  “view  synchrony”).  The  VS  service  is 
partitionable:  it  allows  disjoint  views  of  the  same  group  to  exist  simultaneously.  We  also  developed 
an  algorithm  that  uses  VS  to  implement  a  totally  ordered  broadcast  service  based  on  earlier  work  by 
Keidar  and  Dolev  [21]  and  by  Amir  et  al.  [1].  We  proved  correctness  and  performance  properties  for 
the  algorithm,  the  latter  conditioned  on  behavior  of  the  underlying  network.  Archer  later  expanded 
and  checked  most  of  the  correctness  proofs  using  the  PVS  theorem  prover  [2].  We  also  developed 
computation  workload  distribution  algorithms  based  on  VS  [22,  11].  Other  researchers  adopted  our 
methods  to  model  and  analyze  other  GC  algorithms  and  services  [24,  7]. 

Working  with  system  developers  at  Cornell  [20],  we  used  our  methods  to  model  and  analyze  Birman 
and  Hayden’s  Ensemble  system  [43,  19].  Ensemble  is  organized  in  layers,  where  successive  layers 
introduce successively stronger ordering and reliability properties. We developed global specifications
for two key Ensemble layers (the virtual synchrony layer and another layer that combines
virtual synchrony with a consistent total ordering property) and we modeled the Ensemble algorithm
that bridges between the layers. While attempting a proof, we discovered a significant
logical  error  in  the  state  exchange  sub-algorithm.  This  was  repaired  in  the  actual  system,  and  we 
subsequently  developed  models  and  proofs  for  the  repaired  system.  The  same  error  was  found  to 
exist  in  Ensemble’s  predecessor  GC  system,  Horus  [42]. 


Non-partitionable services: Partitioning is undesirable for applications with strong consistency
requirements, such as totally-ordered broadcast communication or coherent data management.
Following an approach of Yeger Lotem et al. [26], De Prisco, Fekete, Lynch, and Shvartsman defined
a  new  GC  service,  DVS,  which  prevents  partitioning,  allowing  only  one  primary  view  to  exist  at  a 
time.  Primary  views  may  evolve  over  time,  as  long  as  each  view  contains  members  of  the  previous 
view.  We  also  developed  (and  proved  correctness  of)  an  algorithm  that  implements  DVS,  and  an 
algorithm  that  uses  DVS  to  implement  consistent  totally  ordered  broadcast.  Ingols  implemented 
several  primary  service  algorithms  on  a  LAN  and  compared  their  performance  experimentally  [56]. 

We extended this work to allow views to contain extra structure useful for group-oriented applications,
for example, a distinguished leader or sets of quorums [10, 9]. Such structure can be
used, for example, to achieve consistency and availability in the face of transient changes in the
set of participating processes. We call such augmented views configurations. We specified two
configuration-oriented GC services, which allow configurations to change, provided that each configuration
satisfies certain intersection properties with respect to the previous configuration, and
we designed algorithms that implement our services by extending the algorithm from [26]. Furthermore,
we developed two consistent replicated data algorithms that use our services: one based
on Lamport’s Paxos algorithm [23], and the second based on a dynamic, multi-writer version of the
algorithm of Attiya et al. [3]. Both algorithms tolerate long-term changes in the set of participating
processes (using reconfiguration) and transient changes (using quorums).


Scalable group communication: Traditional implementations of GC services have been designed
for local area networks, and their communication and latency requirements do not scale
well to larger networks. Khazan and Keidar developed a scalable GC service, intended for use
in wide-area networks [31, 9, 57, 8]. Their algorithm uses a separate scalable membership service
[75],  implemented  on  a  small  set  of  membership  servers.  Multicast  communication,  however,  is 
implemented  on  all  the  nodes.  In  the  new  algorithm,  a  view  change  involves  only  one  round  for 
state  exchange,  and  that  round  is  conducted  in  parallel  with  the  membership  service’s  agreement 
on  views.  Moreover,  new  participants  can  join  during  view  formation.  Khazan  proved  correctness 
(safety and liveness) of the new algorithm [8], and analyzed its performance, conditioned on limitations
on timing and failure behavior. In particular, he analyzed the time from when the underlying
network  stabilizes  until  the  GC  service  announces  a  new  view,  and  also  analyzed  message  latency  in 
stable  situations.  Khazan  also  designed,  modeled,  and  analyzed  a  data-sharing  application  running 
on  top  of  the  new  GCS.  Tarashchanskiy  implemented  the  new  algorithm  [63]. 


Other  work:  Keidar,  et  al.  have  written  a  comprehensive  article  for  Computing  Surveys  defining 
and  classifying  the  interesting  guarantees  provided  by  group  communication  services  [8].  Other  work 
from  our  research  group  on  group  communication  systems  appears  in  [41,  40,  18]. 


4.1.2  Reliable Broadcast

Early-delivery atomic broadcast: Bar-Joseph, Keidar, and Lynch developed a new, fast algorithm
for atomic broadcast in a dynamic setting, where processes may join, leave, or fail [19, 67].
The  problem  is  to  guarantee  that  participants  receive  sequences  of  messages  that  are  consistent 
with  a  single  global  message  ordering;  in  particular,  they  must  receive  the  same  final  message  from 
a  failed  process.  In  the  absence  of  failures,  our  algorithm  guarantees  constant  latency,  even  when 
participants  join  and  leave.  In  the  presence  of  failures,  the  latency  is  linear  in  the  number  of  failures 
that  actually  occur.  The  main  difficulties  are  that  the  underlying  network  does  not  guarantee  a 
single  total  order,  and  that  different  processes  may  receive  different  final  messages  from  a  failed 
process.  So,  processes  coordinate  message  delivery:  They  divide  time  into  slots,  assign  messages  to 
slots,  and  deliver  messages  slot-by-slot.  Processes  determine  the  members  of  each  slot,  and  deliver 
messages  only  from  members.  To  decide  on  which  processes  fail  in  each  slot,  processes  engage  in  a 
distributed  consensus  protocol.  This  requires  a  new  kind  of  consensus  service,  in  which  participants 
do not know who the other participants are. We defined this new consensus problem, and developed
a new early-stopping algorithm to solve it. Our algorithm improves upon previously-suggested
algorithms  using  group  communication. 

Scalable reliable multicast: Livadas completed his PhD thesis [58] on analyzing and comparing
reliable multicast protocols. He designed a new caching-enhanced version of the well-known SRM
protocol, which he calls CESRM. His analysis shows that, in cases when the expedited recovery
occurs, the latency is only about one fourth of that of un-enhanced SRM. By analyzing real IP
multicast traces, he has shown that expedited recoveries occur about one third of the time. This
work  is  reported  in  [36,  58,  78,  37,  79,  77]. 
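
As a rough illustration of what the two figures above imply together, the following sketch computes an average recovery latency relative to un-enhanced SRM. It assumes, hypothetically, that a recovery that is not expedited costs exactly as much as an SRM recovery; the report does not state this, and the function and parameter names are illustrative only.

```python
from fractions import Fraction


def expected_relative_latency(p_expedited, expedited_ratio, fallback_ratio=Fraction(1)):
    """Average recovery latency as a fraction of plain SRM latency.

    p_expedited:     fraction of recoveries that are expedited
    expedited_ratio: latency of an expedited recovery relative to SRM
    fallback_ratio:  latency of a non-expedited recovery relative to SRM
                     (assumed to be 1, i.e. no better and no worse than SRM)
    """
    return p_expedited * expedited_ratio + (1 - p_expedited) * fallback_ratio


# "About one third of the time" and "about one fourth" of the latency.
estimate = expected_relative_latency(Fraction(1, 3), Fraction(1, 4))
print(estimate)
```

Under that assumption the average works out to about three quarters of the SRM recovery latency; a different fallback cost would change the estimate accordingly.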


4.1.3  Data-Sharing  Services 

Reconfigurable atomic memory (Rambo): Lynch and Shvartsman defined a new, reconfigurable
algorithm for implementing atomic read/write shared memory in dynamic networks [43, 81],
for example, mobile settings or peer-to-peer settings. The algorithm, which we call Rambo (for
“Reconfigurable Atomic Memory for Basic Objects”), tolerates short-term changes by using quorums,
and tolerates long-term changes by reconfiguring. Reconfiguration occurs on the fly, without
heavyweight view change as in group communication systems. The algorithm maintains atomicity
across  configuration  changes. 

The  starting  point  for  this  algorithm  is  a  static  two-phase  quorum-based  implementation  for 
read/write shared memory, as described in [28, 3]. In the first phase of a read or write operation,
a value and associated tag are read from a read-quorum of replicas, and in the second phase,
a value and tag are propagated to a write-quorum. A write operation uses the first phase to determine
the largest tag, picks a larger tag, and uses the second phase to write the new value and
tag. A read operation uses the first phase to determine the latest value and tag, and uses the
second phase to propagate this information, before returning the value to its client. Operations
may proceed concurrently; quorum intersection properties imply that the shared data appears to
be atomic.
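
The static two-phase scheme described above can be sketched as follows. This is a minimal single-process simulation, not the implementation from [28, 3]: it assumes majority quorums serve as both read-quorums and write-quorums, models replicas as in-memory objects, and uses random sampling as a stand-in for "whichever replicas answer first". All class and method names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Replica:
    tag: tuple = (0, "")   # (sequence number, writer id), compared lexicographically
    value: object = None

    def query(self):
        return self.tag, self.value

    def update(self, tag, value):
        # Adopt the pair only if it is newer than what the replica holds.
        if tag > self.tag:
            self.tag, self.value = tag, value


class QuorumRegister:
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.quorum_size = n // 2 + 1   # assumption: majorities, so any two quorums intersect

    def _quorum(self):
        return random.sample(self.replicas, self.quorum_size)

    def _query_phase(self):
        # Phase 1: collect (tag, value) pairs from a quorum and keep the largest tag.
        return max((r.query() for r in self._quorum()), key=lambda pair: pair[0])

    def _propagate_phase(self, tag, value):
        # Phase 2: push (tag, value) to a quorum.
        for r in self._quorum():
            r.update(tag, value)

    def write(self, writer_id, value):
        (seq, _), _ = self._query_phase()
        self._propagate_phase((seq + 1, writer_id), value)

    def read(self):
        tag, value = self._query_phase()
        self._propagate_phase(tag, value)   # write-back before returning
        return value


reg = QuorumRegister(5)
reg.write("w1", "a")
reg.write("w2", "b")
print(reg.read())
```

Because every query quorum intersects every earlier propagation quorum, a read that starts after a write completes sees that write's tag or a larger one. The sketch runs operations one at a time and does not model message delays, failures, or concurrency.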

Rambo  adapts  this  strategy  for  use  in  a  dynamic  setting,  by  allowing  the  system  to  reconfigure  its 
sets  of  read-quorums  and  write-quorums.  It  allows  any  member  of  the  current  quorum  configuration 
to  propose  a  new  configuration;  conflicts  are  resolved  using  a  distributed  consensus  algorithm  such  as 
Paxos  [23].  Although  consensus  is  a  heavyweight  mechanism,  it  is  used  here  only  for  reconfiguration, 
which is an infrequent operation. Also, reconfigurations do not significantly delay read and write
operations,  unlike  what  happens  in  group-communication-based  approaches. 

A process conducting a read or write operation runs an algorithm similar to the static two-phase algorithm
described above, using the current configuration. When a new configuration is determined,
the  read  or  write  operation  continues,  using  the  new  configuration  in  addition  to  the  old  one;  this 
may  require  additional  work  to  access  processes  needed  for  new  quorums.  An  old  configuration  may 
be  abandoned,  after  execution  of  a  two-phase  “garbage-collection”  procedure.  Garbage-collection 
is  performed  in  the  background,  concurrently  with  read  and  write  operations.  We  have  proved 
that,  under  “normal”  timing  assumptions,  garbage-collection  of  old  configurations  keeps  up  with 
introduction  of  new  configurations,  and  read  and  write  operations  take  time  at  most  8d,  where  d 
is  a  bound  on  the  message  delay.  Musial  and  Shvartsman  have  built  a  preliminary  implementation 
of  Rambo  in  a  LAN  [47],  and  are  beginning  to  carry  out  experiments. 

Rambo  II:  The  Rambo  algorithm  garbage-collects  old  quorum  configurations  sequentially,  which 
works  well  under  normal  timing  assumptions.  However,  if  normal  timing  assumptions  are  violated, 
configurations  can  pile  up,  and  garbage-collection  may  take  a  long  time  to  catch  up.  Recently, 
Gilbert,  Lynch,  and  Shvartsman  have  improved  Rambo  with  a  new  garbage-collection  procedure, 
which handles the removal of any number of old configurations in parallel [26, 72].


Separating data and metadata: Many quorum-based data-sharing algorithms (e.g., [3, 28, 43,
81])  couple  data  closely  to  metadata  such  as  tags,  for  example,  sending  data  and  tags  together  in 
messages.  However,  in  practice,  metadata  information  may  be  much  smaller  than  the  actual  data. 
Fan  [23,  24,  53]  has  developed  a  variation  on  the  standard  quorum-based  read/write  algorithms 
that  separates  the  handling  of  the  metadata  from  that  of  the  actual  data.  Most  of  the  complicated 
processing  involves  the  metadata  only.  Handling  of  the  actual  data  is  quite  straightforward  and 
flexible:  all  that  is  necessary  is  to  maintain  enough  copies  of  the  right  versions,  and  to  make  them 
easy  to  find.  The  algorithm  is  efficient.  For  example,  in  each  logical  read  operation,  only  one 
physical  read  of  the  data  and  no  physical  writes  are  performed.  This  is  a  substantial  improvement 
over  most  other  quorum-based  atomic  data  algorithms,  which  read  and  write  data  to  two  quorums 
during  a  logical  read.  Each  read  or  write  operation  uses  three  phases.  While  this  is  more  than  the 
two  phases  required  in  some  other  algorithms  [3,  28,  43,  81],  some  of  the  phases  send  only  short 
messages,  and  so  should  take  negligible  time  in  practice.  Fan’s  algorithm  is  designed  for  a  static 
environment.  But  because  it  allows  data  to  be  stored  in  any  (large  enough)  set  of  processes,  and  not 
only  in  quorums,  it  supports  reconfiguration  of  replicas  in  response  to  failures  or  changing  network 
conditions.  Furthermore,  the  reconfiguration  is  cheap  enough  to  be  applied  on  a  per-operation 
basis.  Fan  has  also  proved  two  lower  bounds  that  show  that  some  of  the  costs  incurred  by  his 
algorithm  are  necessary.  In  particular,  he  has  proved  lower  bounds  on  the  number  of  replicas  that 
are  written  by  read  operations,  and  on  the  number  of  versions  of  an  object  that  must  be  maintained. 


GeoQuorums:  Dolev,  Gilbert,  Lynch,  Shvartsman,  and  Welch  have  developed  a  new  approach 
to  maintaining  atomic  data  in  mobile  ad  hoc  networks,  or  in  networks  that  consist  of  a  combination 
of  mobile  nodes  and  fixed  nodes  [21,  69,  4].  This  approach  involves  building  an  abstract  layer 
consisting of virtual focal point processes, each associated with a fixed geographical region. We
believe  that  introducing  such  a  layer  provides  a  viable,  practical  method  of  programming  mobile 
ad  hoc  networks. 

In  the  GeoQuorums  algorithm,  the  focal  point  processes  engage  in  a  (mostly  static)  quorum-based 
algorithm  to  implement  atomic  read/write  objects.  This  algorithm  includes  certain  optimizations 
over  the  two-phase  protocols  used  in  [28,  3,  43,  81]:  the  first  phase  of  a  write  is  omitted  in  favor 
of  a  tag-selection  protocol  based  on  synchronized  local  clocks,  and  the  second  phase  of  a  read  is 
omitted  when  the  operation  learns  that  the  relevant  information  has  already  been  propagated  by 
another  operation.  Also,  quorums  are  chosen  for  good  performance  in  the  geographical  setting. 
Focal  point  processes  are  implemented  using  either  fixed  or  mobile  nodes;  in  the  case  of  mobile 
nodes,  replicated  state  machine  techniques,  based  on  local  broadcast,  are  used. 


Brewer’s  conjecture:  Gilbert  and  Lynch  [6]  have  proved  several  versions  of  a  “folklore”  result 
stated  a  few  years  ago  by  Eric  Brewer:  that  it  is  impossible,  in  a  fault-prone  communication  network, 
to  guarantee  a  combination  of  atomicity  (consistency)  and  availability.  The  precise  statement  of 
this  result  depends  on  assumptions  about  timing  and  failures;  the  paper  [6]  obtains  results  for 
several sets of assumptions.


4.1.4  Peer-to-Peer Networks


Atomic  shared  memory:  At  the  first  International  Workshop  on  Peer-to-Peer  systems  [33], 
Lynch, Malkhi, and Ratajczak [44] proposed a technique for implementing atomic read/write memory
in ring-like peer-to-peer networks like the classical Chord network [39]. Our method integrates
object-management facilities directly into the basic namespace-management protocols, by systematically
passing control of objects to new processes when the ring changes.


MultiChord: Lynch and Stoica designed and evaluated a peer-to-peer namespace management
service called MultiChord [82]. This algorithm, based on Chord [39], is tuned for
performance  in  a  steady  state  situation,  where  clients  join  and  fail  at  a  bounded  rate.  In  particular, 
(1)  each  node  maintains  table  entries  for  several  neighbors  of  each  power-of-two  successor  in  the 
namespace,  as  well  as  for  its  own  neighbors,  and  (2)  each  node  delays  joining  until  its  table  has 
been  populated  with  all  the  required  entries. 
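
Point (1) above can be illustrated with a small sketch of the table a node might keep. It assumes a global, sorted list of node identifiers (which a real peer-to-peer node would not have) and interprets "neighbors" of a power-of-two successor as the nodes that immediately follow it clockwise on the ring; both are assumptions made for illustration, and the function names are hypothetical. Entries for the node's own neighbors are omitted for brevity.

```python
from bisect import bisect_left


def successor_index(nodes, key):
    """Index of the first node at or clockwise after `key` (nodes sorted ascending)."""
    return bisect_left(nodes, key) % len(nodes)


def routing_table(nodes, me, id_bits, extra):
    """For each i, the successor of me + 2**i plus the `extra` nodes that follow it."""
    ring_size = 2 ** id_bits
    table = {}
    for i in range(id_bits):
        start = successor_index(nodes, (me + 2 ** i) % ring_size)
        table[i] = [nodes[(start + j) % len(nodes)] for j in range(extra + 1)]
    return table


ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # 6-bit identifier space
for i, entries in routing_table(ring, me=8, id_bits=6, extra=1).items():
    print(i, entries)
```

With `extra=0` this reduces to the usual Chord finger table; the additional entries give alternatives when a power-of-two successor has failed.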


4.1.5  Topology Control in Mobile ad hoc Networks

Preserving multiple connectivity: Finally, several PhD students working with our group have
been studying the problem of controlling network topology in mobile ad hoc networks, by adjusting
the nodes’ power settings [16, 27]. In [16], they presented a distributed algorithm for adjusting
power while preserving k-connectivity of the network communication graph, using “cone-based”
techniques based on work of Li et al. [25]. They also proved optimality results for their choice of
cone sizes, and extended the results to three dimensions. Preserving k-connectivity is interesting
because a k-connected graph includes enough redundancy to support message routing in the face
of k - 1 node failures. In [27], they developed a measure of total power consumption and developed
several  algorithms  whose  total  power  consumption  approximates  the  global  optimum. 
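
The fault-tolerance property that motivates k-connectivity can be checked directly on small graphs. The sketch below is a brute-force test, not the cone-based algorithm of [16]: it removes every set of fewer than k nodes and verifies that the remaining graph is still connected. The function names are hypothetical, and the cost grows combinatorially with graph size.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Depth-first search over the subgraph induced by `nodes`."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacent = {v: set() for v in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adjacent[u].add(v)
            adjacent[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for w in adjacent[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def tolerates_failures(nodes, edges, k):
    """True if the graph stays connected after any set of fewer than k nodes fails."""
    nodes = set(nodes)
    return all(
        is_connected(nodes - set(failed), edges)
        for size in range(k)
        for failed in combinations(nodes, size)
    )


ring_nodes = range(5)
ring_edges = [(i, (i + 1) % 5) for i in range(5)]
print(tolerates_failures(ring_nodes, ring_edges, 2))   # a 5-node ring survives any single failure
print(tolerates_failures(ring_nodes, ring_edges, 3))   # two failures can split it
```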


4.1.6  Sensor Networks

We  began  work  on  algorithms  for  networks  of  sensors,  focusing  on  problems  of  time  synchronization, 
message  dissemination,  tracking,  and  routing.  We  are  currently  writing  papers  on  new  algorithms 
for  tracking  and  time  synchronization. 

4.2  Modeling  and  Analysis  Techniques 

Our group has a long history of developing fundamental modeling frameworks for distributed systems,
based on interacting state machines (various forms of I/O automata).


Hybrid  I/O  Automata:  Recent  accomplishments  in  this  direction  include  a  comprehensive 
development  and  presentation  of  the  Hybrid  I/O  Automata  (HIOA)  modeling  framework  [10], 
by  Lynch,  Segala,  and  Vaandrager.  This  framework  supports  description  and  analysis  of  hybrid 
systems, which may consist of real-world components such as land-based vehicles or airplanes, as
well as computer components. The framework allows description of continuous state evolution
as well as discrete transitions. Hybrid systems can be composed to build larger systems, with
continuous  as  well  as  discrete  interaction  between  components,  and  can  be  modeled  at  multiple 
levels  of  abstraction.  We  have  used  the  HIOA  framework  for  many  case  studies  in  the  area  of 
transportation control, including studies of the TCAS collision-avoidance system [34], of a Quanser
model  helicopter  system  [46],  and  of  a  circumvention  and  recovery  system  for  an  inertial  guidance 
system  [55].  We  have  also  used  HIOA  to  describe  mobile  computing  systems,  in  [4,  70]. 


Timed  I/O  Automata:  Kaynar,  Lynch,  Segala,  and  Vaandrager  are  currently  completing  a 
comprehensive  monograph  on  the  Timed  I/O  Automata  (TIOA)  modeling  framework,  intended 
to  support  the  description  and  analysis  of  timed  systems.  An  important  feature  of  this  model  is 
its  support  for  decomposing  timed  system  descriptions.  In  particular,  the  framework  includes  a 
notion  of  external  behavior  for  a  timed  I/O  automaton,  which  captures  its  discrete  interactions 
with  its  environment.  The  framework  also  defines  what  it  means  for  one  TIOA  to  implement 
another,  based  on  an  inclusion  relationship  between  their  external  behaviors,  and  defines  notions 
of  simulations,  which  provide  sufficient  conditions  for  demonstrating  implementation  relationships. 
The  framework  includes  a  composition  operation  for  TIOAs,  which  respects  external  behavior,  and 
a  notion  of  receptiveness,  which  implies  that  a  TIOA  does  not  “block”  the  passage  of  time.  The 
TIOA  framework  supports  the  statement  and  verification  of  safety  and  liveness  properties  for  timed 
systems.  It  defines  what  it  means  for  a  property  to  be  a  safety  or  a  liveness  property,  includes  basic 
results  about  safety/liveness  classification,  and  receptiveness  for  liveness  properties. 

The  TIOA  framework  is  formalized  as  a  special  case  of  the  HIOA  framework,  in  which  components 
interact  by  discrete  events  only  [73].  This  restriction  leads  to  simplifications  in  some  of  the  HIOA 
definitions  and  results,  especially  those  involving  composition.  Although  this  monograph  contains 
some  new  results  (e.g.,  about  liveness  properties)  its  main  purpose  is  to  serve  as  a  “user  manual”  to 
explain  practical  methods  of  modeling  and  analyzing  timing-based  distributed  systems.  A  summary 
version  appeared  in  RTSS ’03  [30]. 


Probabilistic I/O Automata: Lynch, Segala, and Vaandrager have begun working on compositional
models of systems that include probabilistic behavior, based on earlier definitions of
Probabilistic I/O Automata (PIOA) by Segala [35, 36, 37] (also see Stoelinga [38]). In a preliminary
paper [45], we have characterized directly, in terms of a simulation relation, a notion of
implementation for PIOAs that was previously defined only implicitly [35, 36, 37]. Perhaps surprisingly,
this turns out to be a very fine relation, exposing much of the internal branching structure
of  the  automata.  We  are  currently  working  on  developing  coarser  implementation  relations,  based 
on  restricting  the  scheduling  of  system  components. 


Dynamic I/O Automata: Finally, Attie and Lynch [65] have developed a Dynamic I/O Automata
modeling framework, which allows component capabilities (signatures) to change and includes
explicit actions representing process creation and destruction [66]. This work is somewhat
related  to  Milner’s  Pi-Calculus  [29,  30];  however,  our  specific  choices  of  primitive  notions  to  include 
in  the  model  are  quite  different,  and  our  focus  is  on  mathematical  semantics  rather  than  on  formal 
notation and process calculi. Early versions of this work appeared in [15, 14].


Analysis methods: As a byproduct of our work on scalable group communication, described in
Section 4.1.1, we found new methods of incremental modeling and proof for distributed system components
modeled as I/O automata [8]. These techniques are inspired by inheritance notions from
object-oriented  programming.  These  techniques  support  incremental  construction  of  specifications, 
algorithm  descriptions,  and  abstraction  proofs  showing  that  algorithms  meet  their  specifications. 

We  used  these  methods  to  develop  models  and  safety  proofs  for  our  scalable  GC  services. 

We analyzed the performance of several complex algorithms using conditional methods [5, 57, 43,
81, 26, 72, 82]. In particular, we proved latency bounds in situations in which the underlying system
exhibits  completely  benign  behavior  from  some  point  onward  [5,  57,  82].  We  also  proved  bounds 
for  situations  in  which  the  system  exhibits  good  steady-state  behavior  throughout  the  execution 
[43,  81,  82] — not  completely  benign  behavior,  but  rather,  behavior  in  which  the  rate  of  change 
(joins  and  failures)  is  bounded.  We  also  analyzed  some  protocols  in  situations  in  which  timing  and 
failure  behavior  is  arbitrary  up  to  some  point,  then  follows  a  steady-state  pattern  from  that  point 
onward  [26,  72].  Steady-state  analysis  is  still  difficult  to  do;  more  remains  to  be  done  to  make  these 
methods  tractable. 

Tools: Garland, Kaynar, and Lynch, working with many students, have been designing, implementing,
and testing the experimental IOA specification language and toolset. This language and
toolset are intended to support distributed programming based on high-level, composable mathematical
models. Tools include:

1.  A simulator, which is capable of simulating algorithms using multiple levels of abstraction.
Early versions of the simulator were designed and built by Chefter [6], Ramirez [61], and
Dean [52]. The latest version of the simulator, by Solovey [62], is also capable of simulating
the composition of several automata. The latest reference manual for the simulator appears
in [74].

2.  A  connection  to  the  Daikon  invariant  discovery  tool  [13]. 

3.  A  connection  to  the  Larch  theorem-prover  [17,  51,  20]. 

4.  A connection to the Isabelle/HOL theorem-prover [48, 60]. Ne Win used this connection for
experiments in reducing the amount of human interaction required to discover and prove interesting
properties of distributed algorithms [11]. These experiments used invariants suggested
by Daikon.

5.  A “Composer” tool, an addition to the IOA front end, which expands the definition of a
composite automaton into an equivalent primitive automaton [100].

6.  A  code  generator  for  distributed  code  (to  run  in  our  LAN)  is  still  in  progress  [98,  99]. 

The latest release of the IOA language and toolset is available at URL http://theory.lcs.mit.edu/tds/ioa
and includes a comprehensive user guide and reference manual [97].

These tools do not (yet) support additions to the model such as timing, continuous behavior, or
probabilistic behavior. We are currently extending the IOA language with notations for timing
behavior based on the theoretical framework in [73], and plan to extend the tools to accommodate
this extension.
Ireland, June 2000.


Carolos Livadas: Modeling, Analyzing, and Improving SRM; A Formal Venture into the Realm of Reliable
Multicast. BBN Technologies, Internetwork Research and Mobile Systems Departments, May 7, 2003.

Modeling,  Analyzing,  and  Improving  SRM;  A  Formal  Venture  into  the  Realm  of  Reliable  Multicast. 
Lincoln  Laboratories,  April  11,  2003. 

A Formal Venture into Reliable Multicast Territory. Formal Techniques for Networked and
Distributed Systems - FORTE 2002, Houston, Texas, November 2002.

Designing a Caching-Based Reliable Multicast Protocol. DSN'01 in Sweden.


Nancy Lynch: Reconfigurable Atomic Memory for Dynamic Networks. Ecole Polytechnique,
Lausanne, Switzerland, June 2003.

RAMBO:  A  Reconfigurable  Atomic  Memory  Service  for  Dynamic  Networks.  Boston  University, 
April  2003. 

Hybrid  Input/Output  Automata:  Theory  and  Applications.  19th  Conference  on  the  Mathematical 
Foundations  of  Programming  Semantics,  March  2003. 

RAMBO: A Reconfigurable Atomic Memory Service for Dynamic Networks. Grace Hopper
Distinguished Lecturer, University of Pennsylvania, February 2003.

RAMBO: A Reconfigurable Atomic Memory Service for Dynamic Networks. University of
California, San Diego, January 2003.

RAMBO: A Reconfigurable Atomic Memory Service for Dynamic Networks. 16th International
Symposium on Distributed Computing (DISC), Toulouse, France, October 2002.

RAMBO: A Reconfigurable Atomic Memory Service for Dynamic Networks. Distinguished
Lecturer, Johns Hopkins University, September 2002.

Modeling  Distributed  Systems  Using  I/O  Automata.  Draper  Labs,  August  30,  2002. 

New Directions for NEST Research. Building Blocks for High-Performance, Fault-Tolerant
Distributed Systems, NEST DARPA PI meeting, July 12, 2002.

Impossibility  of  Consensus  With  One  Faulty  Process.  High  school  women’s  summer  program,  MIT, 
July 1, 2002.

Early-Delivery Dynamic Atomic Broadcast. U. Salerno, Salerno, Italy, June 10, 2002.

Building  Blocks  for  High  Performance,  Fault-Tolerant  Distributed  Systems.  AFOSR  PI  meeting, 
June  3,  2002. 

Hybrid I/O Automata. MIT SEC meeting, May 17, 2002.

Early-Delivery Dynamic Atomic Broadcast. NYU, May 3, 2002.

Early-Delivery  Dynamic  Atomic  Broadcast.  U.  Nijmegen,  April  22,  2002. 

Research  Directions  in  Computer  System  Security,  NSF  Workshop,  February  14,  2002. 

Early-Delivery  Dynamic  Atomic  Broadcast.  UC  Berkeley,  February  12,  2002. 

Early-Delivery  Dynamic  Atomic  Broadcast.  Tufts  University,  February  6,  2002. 

Implementing Atomic Objects in a Dynamic Environment. Leslie Lamport's 60th Birthday
Celebration. PODC'01, Newport, R.I. August, 2001.

Dynamic Input/Output Automata: a Formal Model for Dynamic Systems. PODC'01, Newport,
R.I. August, 2001.

Hybrid  I/O  Automata,  a  Mathematical  Model  for  Hybrid  Systems.  Albert  Meyer’s  60th  Birthday 
Celebration.  Boston,  MA.  June,  2001. 

Hybrid I/O Automata, a Mathematical Model for Hybrid Systems. Laboratory for Information
and Decision Systems, MIT. May, 2001.

Totally Ordered Multicast with QoS. Workshop on Perspectives on Algorithms and Distributed
Algorithms. Luminy, France. May, 2001.

Hybrid I/O Automata, a Mathematical Model for Hybrid Systems. Workshop on Perspectives on
Algorithms and Distributed Algorithms. Luminy, France. May, 2001.

Hybrid  I/O  Automata,  a  Mathematical  Model  for  Hybrid  Systems.  University  of  Pennsylvania, 
Phila.,  PA.  April,  2001. 

Hybrid Input/Output Automata, Revisited. Hybrid Systems: Computation and Control. Rome,
Italy. March, 2001.

Defining  the  Oxygen  Software  Architecture.  Oxygen  brainstorm  meeting,  MIT.  February,  2001. 

Reliable Group Communication: A Mathematical Approach. Distinguished Lecture. Purdue
University. November, 2000.


Nancy  Lynch  and  Stephen  Garland:  Modeling  and  Analyzing  Distributed  Systems  using  I/O 
Automata.  Draper  R&D  Kickoff,  August  2002. 


Sayan  Mitra:  Safety  Verification  of  Model  Helicopter  Controller  using  Hybrid  Input/Output 
Automata. 6th International Workshop, HSCC'03, Prague, the Czech Republic, April 3-5, 2003.


David Ratajczak: Atomic Data Access in Content-Addressable Networks. MIT Workshop on
Peer-to-Peer  Computing,  March,  2002. 


Alex Shvartsman: Distributed Cooperation in the Presence of Failures and Delays. Computer
Science  Colloquium,  Yale  University,  2000. 

Distributed Cooperation in the Presence of Failures and Delays. Computer Science Seminar, Ecole
Polytechnique, France, 2000.

7.2  Consultative  and  advisory  functions  to  other  laboratories  and  agencies, 
especially  Air  Force  and  other  DoD  laboratories 

See  “Transitions”,  below,  for  further  details  about  these  projects. 

Nancy  Lynch  and  Stephen  Garland:  Assisted  developers  at  Draper  Laboratories  in  a  project  to 
model  and  analyze  a  critical  military  system.  Fall,  2001-Spring,  2003.  Contact:  Joe  Kochocki. 

Nancy Lynch: Occasional contact with Lincoln Labs, through AFOSR-funded former PhD student
Roger Khazan, who is working on the design of a chat-like communications application for
Air Force missions. In progress.

Nancy Lynch: Consultant to the director of the Division of Computer and Information Sciences and
Engineering, NSF. September 2000-June 2002. Contact: Dr. Ruzena Bajcsy, NSF.

Nancy  Lynch:  Technical  advisor,  Centrata  corporation. 

Sayan Mitra: Working at Naval Research Laboratory summer 2003, with Dr. Myla Archer.
Developing tools for modeling and analyzing timing-based and hybrid systems of use in the Navy.

7.3  Transitions 

Transition  1: 

(a)  Customer: 

Center for High Assurance Computer Systems, Naval Research Laboratory, Code 5546, 4555 Overlook
Avenue, S.W., Washington, DC 20375-5320

Dr.  Myla  Archer,  archer@itd.nrl.navy.mil,  202-767-2389 

Dr. Connie Heitmeyer, Head, Software Engineering, heitmeyer@itd.nrl.navy.mil, 202-767-3596

(b)  Research  result: 

The definition of our timed I/O automaton mathematical model. Methods of modeling timing
constraints. Inductive proof methods for proving invariant assertions, simulation relationships, and
timing properties. The new definitions of hybrid I/O automata.

(c)  Application: 

Developers  at  the  NRL  have  used  our  timed  I/O  automaton  model  and  its  proof  methods  as  the 
basis  of  the  TAME  tool  for  modeling  and  verifying  high  assurance  systems  of  interest  to  the  Navy. 
One of my PhD students, Sayan Mitra, worked at NRL last summer, developing the tools further,
in  particular,  adding  support  for  hybrid  systems. 

(d)  What  was  accomplished: 

This has led to usable modeling and proof tools. It has also led to improvements in the PVS
theorem-prover, which is used by the TAME system for carrying out formal proofs. Various
application case studies have been carried out.

(e)  Why  it  is  important: 

Timed  and  hybrid  I/O  automata  form  a  sound  mathematical  foundation  for  distributed  systems 
with  timing  constraints.  Sound  methods  for  describing  and  analyzing  the  design  of  such  systems 
require  such  a  foundation.  Such  methods  can  be  used  to  make  the  process  of  developing  distributed 
systems  more  efficient  and  reliable. 

Transition  2: 

(a)  Customer: 

Nippon Telegraph and Telephone Communication Science Labs, 2-4 Hikaridai, Seika-cho,
Soraku-gun, Kyoto, Japan, 619-0237

Ken  Mano,  mano@cslab.kecl.ntt.co.jp 

Yoshifumi  Manabe,  manabe@cslab.kecl.ntt.co.jp 

(b)  Research  result: 

Our modeling and verification techniques for distributed systems, based on (untimed) I/O
automata. Our formal IOA language and its proof tools.

(c)  Application: 

Researchers  at  NTT  have  used  our  model,  techniques,  language,  and  tools  to  describe  the  agent 
programming  systems  that  they  are  building.  In  particular,  they  are  applying  our  results  to  the 
design  and  implementation  of  their  general  NePi2  agent  programming  system. 

(d)  What  was  accomplished: 

One NTT employee, Yoshinobu Kawabe, produced a complete model of the NePi2 system
implementation, at several levels of abstraction, using IOA. Furthermore, he carried out a complete
proof of correctness, based on invariants and simulation relations, with our theorem-proving tools.

(e)  Why  it  is  important: 

Besides  what  was  accomplished  for  the  specific  NePi2  system,  this  work  demonstrates  the  feasibility 
of  validating  complete,  significant-sized  system  designs  using  our  methods. 

Transition  3: 

(a) Customer: Draper Laboratories, 555 Technology Square, Cambridge, MA 02139
Joseph Kochocki, jkochocki@draper.com, 617-258-1285

(b) Research result:

Our modeling and verification techniques for distributed systems with timing constraints, based
on  timed  I/O  automata.  In  particular,  our  methods  for  decomposing  systems  into  interacting 
components  and  into  levels  of  abstraction,  and  our  methods  of  modeling  time  deadlines  for  events. 

(c)  Application: 

We  worked  directly  with  Draper  Labs  software  developers  and  management,  performing  a  thorough 
analysis  of  a  simulated  version  of  a  major  system  that  Draper  is  developing.  MEng  student  Vida 
Ha,  working  as  a  Draper  Fellow,  carried  out  the  bulk  of  the  work. 

(d)  What  was  accomplished: 

A complete model for the system, at multiple levels of abstraction, was developed. Statements
of key properties satisfied at the various levels were produced. These properties include critical
system reliability and timing properties. Formal proof sketches of some of the statements were
carried out.

(e)  Why  it  is  important: 

Systems of this kind are critically important to the national interest. Yet the design and
implementation techniques used currently are unwieldy and unreliable. The key to making the
development process more tractable and more reliable is to raise the level of abstraction at which
the systems are described and analyzed.

Transition  4: 

(a)  Customer:  MIT  Department  of  Aeronautics  and  Astronautics 
Prof.  Eric  Feron,  feron@mit.edu,  617-253-1991 

(b)  Research  result:  Our  mathematical  model  for  hybrid  continuous/discrete  systems,  which  we 
call  Hybrid  I/O  Automata.  Our  inductive  methods  for  proving  properties  of  hybrid  systems.  Our 
proposed  formal  language  for  describing  hybrid  I/O  automata. 

(c) Application: The Aero/Astro department uses a Quanser model helicopter in teaching students
how to design helicopter controllers. In order to protect the model from badly designed controllers,
developers in Aero/Astro needed to add a "supervisory controller" module to their system. We
worked with Aero/Astro developers and researchers to develop a design for such a controller. We
helped to document the design in terms of the HIOA model, and helped to prove it safe using our
proof methods.

(d)  What  was  accomplished:  The  design  was  completely  developed  and  validated.  Implementation 
of  the  actual  controller  for  the  physical  helicopter  system  is  nearly  completed. 

(e)  Why  it  is  important:  This  demonstrates  the  feasibility  of  complete  modeling  and  analyses  of 
such  systems,  at  high  levels  of  abstraction.  Such  models  and  proofs  provide  high  assurance,  ahead 
of  time,  that  the  systems  will  work  as  intended.  They  also  make  explicit  the  precise  reasons  why 
the  system  behaves  correctly.  Such  models  can  be  reused  to  help  in  development  of  similar  systems. 

Transition  5: 

(a) Customer: Lincoln Laboratories

Dr. Roger Khazan, roger@lcs.mit.edu, 781-981-5976

Dr. Cliff Weinstein, cjw@ll.mit.edu, 781-981-7621

(b) Research result: Algorithms and service definitions for group communication services.
Techniques for designing and analyzing such algorithms and services.

(c)  Application:  Roger  Khazan,  who  worked  on  our  AFOSR  project  at  MIT,  joined  Lincoln  Labs 
in  2002  as  a  research  staff  member.  His  PhD  thesis  was  on  group  communication  algorithms  and 
services.  He  has  started  a  new  effort  at  Lincoln  Labs  to  develop  group  communication  services  for 
military  use,  specifically,  for  “chat-style”  services  to  aid  in  communication  during  missions.  Current 
communication  services  that  the  new  ones  are  designed  to  replace  have  problems  with  reliability 
and  efficiency;  it  is  hoped  that  the  application  of  group  communication  technologies  will  result  in 
more  robust,  more  efficient  communication  systems. 

(d) What was accomplished: Preliminary design discussions are under way. The Lincoln Labs team
hopes to develop a credible design of use to the military.

(e)  Why  it  is  important:  If  the  new  project  is  successful,  it  will  result  in  more  robust,  more  efficient 
communication  systems  for  military  missions. 


8  New  discoveries,  inventions,  patent  disclosures 

One patent was issued: "Model-Based Software Design and Validation" by Stephen Garland and
Nancy Lynch, Sept. 11, 2001. This is for our work on the IOA language and tools, specifically for
a design methodology combining theorem-proving and code generation for distributed systems.


9  Honors  and  awards 

Toh  Ne  Win:  Received  an  award  from  the  MIT  EECS  department  last  spring  for  the  best  MEng 
thesis  in  computer  science,  2003. 

Seth Gilbert: His paper with Nancy Lynch, Alex Shvartsman, and Jennifer Welch, "GeoQuorums:
Implementing atomic memory in ad hoc networks," was selected for a special edition of Distributed
Computing, 2003.

Stephen Garland and Nancy Lynch: IOA work was featured as one of Technology Review's "10
Emerging Technologies That Will Change the World," February, 2003.

Carl Livadas: Chosen as a Barger Fellow at BBN (named after Dr. James Barger, who is a
distinguished research scientist at BBN), 2003. This is a new Fellowship program at BBN that is used
to attract distinguished new PhD graduates. Carl was chosen as the first such fellow based on the
work he did in the TDS group.

Nancy  Lynch:  Chosen  as  Grace  Hopper  lecturer,  U.  Pennsylvania,  2003. 

Gregory Chockler: His paper with Dahlia Malkhi, "Active Disk Paxos with Infinitely Many
Processes," was selected to appear in the PODC 2002 issue of the Distributed Computing journal.

Roger Khazan: His paper with Idit Keidar, Nancy Lynch, and Alex Shvartsman,
"An Inheritance-Based Technique for Building Simulation Proofs Incrementally," was invited for
submission to TOSEM, 2002.

Nancy Lynch: Distinguished Lecturer, Johns Hopkins University, 2002.

Chris  Luhrs:  Winner  of  the  Seventh  Annual  Anna  Pogosyants  UROP  prize  for  his  work  with 
Stephen  Garland  and  Nancy  Lynch,  2002. 

Andrej  Bogdanov:  Honorable  mention  for  MEng  thesis,  2001. 

Idit  Keidar:  won  the  Alon  Fellowship  for  Junior  Faculty,  2001. 

Idit  Keidar:  awarded  a  Technion  Management  Career  Development  Chair,  2001. 

Nancy  Lynch:  Elected  Member  of  National  Academy  of  Engineering,  2001;  ACM  Fellow. 

Nancy Lynch, together with Michael Fischer and Michael Paterson: won the second annual
Principles of Distributed Computing (PODC) conference award for "most influential paper in the field",
2001. They won this award for their 1985 paper "Impossibility of Distributed Consensus with One
Faulty Process".

Alex  Shvartsman:  won  the  Outstanding  Research  Award  for  Junior  Faculty  at  the  University  of 
Connecticut,  2001. 

Nancy Lynch: Distinguished Lecturer, Purdue University, 2000.

Michael Tsai: Winner of the Fifth Annual Anna Pogosyants UROP Award for his work on the IOA
Toolset with Joshua Tauber and Nancy Lynch, 2000.

Gregory Chockler: His paper with Danny Dolev, Roy Friedman and Roman Vitenberg,
"Implementing Caching Service for Distributed CORBA Objects," won two best paper awards in the
Middleware 2000 Conference.


